ABSTRACT — From the beginning of life, face and language
processing are crucial for establishing social communica- 
tion. Studies on the development of systems for processing 
faces and language have yielded such similarities as per- 
ceptual narrowing across both domains. In this article, we 
review several functions of human communication, and
then describe how the tools used to accomplish those 
functions are modified by perceptual narrowing. We con- 
clude that narrowing is common to all forms of social 
communication. We argue that during evolution, social 
communication engaged different perceptual and cogni- 
tive systems — face, facial expression, gesture, vocaliza- 
tion, sound, and oral language — that emerged at different 
times. These systems are interactive and linked to some 
extent. In this framework, narrowing can be viewed as a 
way infants adapt to their native social group. 



Olivier Pascalis, Hélène Loevenbruck, and Sonia Kandel, Université
Grenoble Alpes, CNRS, LPNC UMR5105, France; Hélène
Loevenbruck, Grenoble Images Parole Signal Automatique, CNRS,
UMR5216, France; Paul C. Quinn, Department of Psychology,
University of Delaware, DE, USA; James W. Tanaka, Department of
Psychology, University of Victoria, Canada; Kang Lee, Institute of
Child Study, University of Toronto, Canada.

This research was supported by a grant from the Eunice Kennedy
Shriver National Institute of Child Health and Human Development
(R01 HD-46526).

Correspondence concerning this article should be addressed to
Olivier Pascalis, Laboratoire de Psychologie et NeuroCognition,
Université Grenoble Alpes, BP 47 38040, Grenoble, Cedex 9,
France; e-mail: olivier.pascalis@upmf-grenoble.fr.

© 2014 The Authors. Child Development Perspectives © 2014 The Society for Research in
Child Development 

This is an open access article under the terms of the Creative Commons 
Attribution-Noncommercial License, which permits use, distribution and reproduction in
any medium, provided the original work is properly cited and is not used for commercial
purposes. 

DOI: 10.1111/cdep.12064



KEYWORDS — narrowing; face; speech 



Social life requires relationships with other group members, 
acknowledgment of their status, and communication between 
individuals. Depending on the species studied, communication 
occurs through vocalization, language, faces and their expres- 
sions, or some combination of these. Similarities observed across 
species may provide insights into the relation between different 
social communication tools and networks. Based on these obser- 
vations, we argue here that communicative tools emerged during 
evolutionary time and that current systems reflect aspects of this 
evolution. 

In humans, faces and language are essential for communica- 
tion, but they have been studied traditionally as separate areas 
with little interaction between the two domains, even when their 
links are acknowledged. In some frameworks, they even have 
been conceived of as independent cognitive modules. If faces
provide an early channel of communication for newborns prior 
to comprehending gestural or oral language, postnatal exposure 
to the mother's voice-face combination is required to recognize
the mother's face (Sai, 2005). In one study, moving faces were 
recognized only when sound was present (Coulon, Guellai, &
Streri, 2011). Thus, face processing seems to be facilitated by 
voice processing, even at an early age. 

Later, in early childhood, most conversations take place face- 
to-face. Although auditory information alone is sufficient to 
understand speech, we rely systematically and unconsciously on 
visual information provided by a speaker's face. Seeing oro-facial 
gestures of the speaker accelerates recognition of core words 
(Fort et al., 2012) and enhances intelligibility in noisy environ- 
ments (Benoit, Mohamadi, & Kandel, 1994). Therefore, most 
human conversations — except when we are on the phone — 
involve analyzing facial configurations to locate cues relevant
to decoding speech. Thus, the integration of audio and facial
information is crucial to speech perception. 
These observations point to a close link between face and
language processing that, we argue, may reflect how social 
communication evolved and how it develops in infants and chil- 
dren. More specifically, functional links between gestural and 
oral communication in nonhuman primates as well as infants 
suggest that social communication is a multimodal system, 
involving manual and visuo-facial gestures as well as vocaliza- 
tion. This multimodal system is gradually tuned during develop- 
ment, with narrowing occurring in all the different modalities of 
communication. 

FACE PROCESSING, LANGUAGE PROCESSING, AND 
DEVELOPMENT 

Human adults can recognize familiar faces easily and are said 
to process faces expertly. Faces form a category of stimuli that 
are homogeneous in terms of the positioning of their internal ele-
ments, and humans have developed a signature way to discrimi-
nate them based on configural (i.e., relational) information, such
as the distance between the eyes or between lips and chin. 
Experience likely plays a critical role in acquiring face exper- 
tise (Lee, Anzures, Quinn, Pascalis, & Slater, 2011). 

Language is a key tool for social communication because it 
allows for transmitting complex information that facial expres- 
sions cannot. It is a complex cognitive skill requiring recursion 
and displacement (Chomsky, 1965), yet children acquire it 
swiftly and without instruction, whereas most adults find learn- 
ing a second language challenging. Studies of language acquisi- 
tion have discovered crucial milestones: Vocalizations are 
observable at birth, babbling emerges at around 6-8 months, 
children utter their first words at 10-12 months, and they begin 
to make word combinations and form proto-sentences at around 
20-24 months (Vihman, 1996). 

Studies of the development of the systems that process faces 
and language have identified similarities between the two. 
Face processing develops during the first years of life from a 
broad nonspecific system to a human-tuned face processor 
(Nelson, 2001). Faces observed within the infants' visual envi- 
ronment shape and influence the developing face system 
through a process known as perceptual narrowing: a progres- 
sion whereby infants maintain the ability to discriminate 
stimuli to which they are exposed, but lose the ability to 
discriminate stimuli to which they are not exposed. This 
course of responsiveness is similar for language development. 
In the first year, initial discriminatory ability reflecting a uni- 
versal sensitivity to the sounds of all human languages narrows 
as a consequence of predominant exposure to one's native lan- 
guage and scarce exposure to other languages (Werker & Tees, 
1999). During this time, infants become tuned to their native 
language and the distribution of phonetic information in the 
ambient language at the expense of discriminating nonnative 
contrasts. In other words, infants become experts at processing 
frequently experienced faces and native sounds. 



Narrowing cuts across both visual and auditory modalities, 
possibly reflecting the development of a common neural archi- 
tecture (Scott, Pascalis, & Nelson, 2007). Narrowing could be a 
pan-sensory process; that is, the same phenomenon is observed 
in various senses during the same period and is part of the 
development of our multisensory representation of the world 
(Lewkowicz & Ghazanfar, 2009). This line of thinking raises 
questions such as: Is perceptual narrowing amodal? Is auditory 
narrowing linked to visual narrowing? 

One argument for the link between the development of face 
and language processing comes from neuroanatomy. The supe- 
rior temporal sulcus (STS) is associated with face processing 
and auditory representation of speech components (Demonet, 
Thierry, & Cardebat, 2005; Haxby, Hoffman, & Gobbini, 2000). 
The posterior part of the STS may be considered an amodal con- 
vergence zone that plays a key role in integrating face and voice 
information (Belin, Bestelmeyer, Latinus, & Watson, 2011). 
These findings suggest similar, interacting, and common brain 
circuits for processing faces and speech. 

Descriptions of narrowing fail to consider the evolution and 
timing of when face and language processing emerged. What 
drives or motivates the development of both face and language 
processing is the urge to communicate. In the rest of the article, 
we describe several functions of human communication, then 
explain how perceptual narrowing modifies each of these, and
conclude that narrowing is a common characteristic of all social 
communication. 

GESTURAL AND ORAL COMMUNICATION 

Human language is described as unique even though some form 
of communication exists in other species. Understanding the 
emergence of language during evolution is a challenge, as fossil 
evidence does not provide much insight into oral language. Two 
means of communication are seen as potential precursors to 
human language — vocal calls and gestures — although it is debat- 
able whether language originated in manual gestures or evolved 
exclusively in the vocal domain. The former hypothesis considers 
pointing as the initial means to communicate, which later devel- 
oped into a gestural language. Language may have evolved from 
manual gestures, and then gradually incorporated vocal ele-
ments, so that language involves reciprocity in the actions of part-
ners (Corballis, 2003). The mechanism could be supported by 
mirror neurons, located in Broca's area in humans (Buccino 
et al., 2001). This area is involved with vocalization as well as 
manual action and could have been used as a neural substrate 
for interspecific communication and then to process speech. 

In addition, gestures, and more specifically pointing, are 
associated closely with language development (Kita, 2003). Ocu-
lar pointing (or deictic gaze, at 6-9 months) and later index finger
pointing (deictic gesture, at 9-11 months) are key stages in cogni- 
tive development that are correlated with stages in speech devel- 
opment. Finger pointing is associated with learning new word 
forms and their associated meanings, and when accompanied by
word production (at 16-20 months), fosters the emergence of sen- 
tences. At later stages, children start using prosodic focus, that is, 
vocal pointing (Menard, Loevenbruck, & Savariaux, 2006), or con-
structions involving a deictic pronoun (Diessel & Tomasello,
2000). Different pointing modalities may share a common cerebral 
network: Ocular, digital, and prosodic pointing are associated with 
left parietal activation (Loevenbruck, Dohen, & Vilain, 2009). 
These findings suggest a link between gesture and language. 

However, the referential and combinatorial properties of pri- 
mate vocal communication suggest that language is also rooted 
in vocalization (Arnold & Zuberbühler, 2008): Chimpanzees
produce and understand functionally referential calls, such as 
an alarm call for a snake, and monkeys can combine existing 
calls into higher order meaningful sequences. Furthermore, syl- 
lables may derive from cycles of rhythmic opening and closing 
of the jaw involved in chewing, sucking, and licking, which take 
on communicative significance as lip smacks, tongue smacks, 
and teeth chatters (MacNeilage, 1998). These observations sug- 
gest a direct evolutionary trajectory from primate vocalizations 
to human speech rather than a complex route requiring an inter- 
mediate stage of gestural communication. 

Our view is that functional links between gestural and oral 
communication, observed in nonhuman primates and infants, 
suggest that communication is a multimodal system involving 
manual and visuo-facial gestures as well as vocalization. Human 
communication may have switched to oral-dominant language for 
several reasons, including accessibility without seeing the other 
person (e.g., at night or from a distance) and accessibility while 
doing something else with the forelimbs (e.g., carrying or using 
tools; Corballis, 2003). Humans would have gradually used the 
oro-facial region more than the hand in communicating. 

Clearly, different kinds of communication existed before oral 
language, including vocalizations, facial expressions, and visuo- 
facial gestures. These findings highlight the strong phylogenetic 
and ontogenetic links between face and language processing. 

NARROWING ACROSS DOMAINS THAT INVOLVE 
SOCIAL COMMUNICATION 

Faces 

Although 6-month-olds recognize different races of human faces 
as well as different monkey faces, 9- to 10-month-olds recognize 
reliably only faces of their own species and race (for a review, 
see Lee et al., 2011). Successful social communication relies on 
our ability to process information that allows us to identify peo- 
ple with whom we interact, such as identity, age, and gender. 
Specialization for faces of our own race improves our ability to 
extract such information. Regarding voice recognition, 7-month- 
olds detected changes in voice only when the language was in 
their native tongue (Johnson, Westrek, Nazzi, & Cutler, 2011), 
suggesting that voice recognition develops in pace with increas-
ing competence in language processing. However, younger
infants' ability has not yet been reported and we, therefore, can-
not conclude that narrowing has occurred in this domain.

In addition to recognizing faces, infants also learn to recognize 
facial expressions, which further feeds into their abilities to com- 
municate socially (Quinn et al., 2011). Perceptual narrowing has 
been found for recognizing emotions in 9-month-old infants, but 
only for faces of their own race (Vogel, Monesson, & Scott, 2012), 
suggesting that perceptual narrowing affects stimuli that are 
important for communication with conspecifics and in-groups. 

Audiovisual Speech 

By the end of the first year of life, responsiveness to nonnative 
audiovisual inputs declines both in sound-face matching for
other species and in nonnative language (Lewkowicz & Ghazan- 
far, 2009; Pons, Lewkowicz, Soto-Faraco, & Sebastian-Galles, 
2009). In a study that used silent video clips of a bilingual 
speaker telling a story in two languages, monolingual 4- and 6- 
month-olds discriminated visually between the two languages, 
whereas monolingual 8-month-olds did not (Weikum et al., 
2007). The link between face and language processing is also 
illustrated by research in which infants watched and listened to 
a female speaking their native language or a nonnative lan- 
guage. Four-month-olds looked more at the eyes, 6-month-olds 
looked equally at the eyes and mouth, and by 8 months, infants 
shifted their attention to the mouth, regardless of the language 
spoken. These findings suggest that infants begin to focus on the 
mouth of a talker precisely when they start babbling (Lewkowicz 
& Hansen-Tift, 2012). In contrast, 12-month-olds no longer 
focused on the mouth when exposed to native speech, but con- 
tinued to look more at the mouth when exposed to nonnative 
speech (Kubicek et al., 2013; Lewkowicz & Hansen-Tift, 2012). 

Music-Rhythm

Music is important for communication and may be involved in 
comforting, courtship, movement coordination, and social cohe- 
sion (Brown, 2003). It requires social skills, such as vocal/ges- 
tural imitation, and involves cultural transmission. It may even 
be considered a form of oral communication that emerged before 
language (Fitch, 2006). If narrowing happens for any form of
communication, it should also occur for music. Indeed, in one 
study, 6-month-olds were able to discriminate rhythms specific 
to their culture and those unfamiliar to them; however, 
12-month-olds could do so only with a rhythm specific to their 
culture (Hannon & Trehub, 2005). Furthermore, early and active 
exposure to culture-specific music rhythms and tonalities may 
accelerate perceptual narrowing in music (Trainor, Marie, Gerry, 
Whiskin, & Unrau, 2012). 

Auditory Speech 

Narrowing of speech perception is also well documented.
Infants' speech perception becomes tuned toward their native 
language at around 10-12 months. Young infants discriminate
fine phonetic differences, such as differences in voice onset 
time, between consonants such as /pa/ and /ba/ (Eimas, Sique-
land, Jusczyk, & Vigorito, 1971). Infants are also able to dis-
criminate vowels (e.g., between /a/ and /i/ or /i/ and /u/; Trehub,
1973). Not only can infants younger than 6-8 months discrimi-
nate categorically native phonetic contrasts, they can also dis- 
criminate those that fall outside their native language. For 
example, 6- to 8-month-olds who are learning English can dis- 
criminate the nonnative dental/retroflex contrasts such as the 
Hindi /Ta/ versus /ta/ (Werker & Tees, 1999). However, a 
decline in cross-language consonant perception occurs at 10- 
12 months. Younger children can discriminate many phonetic 
differences, whereas older children lose this ability for contrasts 
that fall outside their native language. Therefore, phonetic dis- 
crimination starts as language general but gradually narrows, 
showing language-specific tuning. 

Sign Language 

Narrowing has also been observed in perceiving sign language 
(Palmer, Fais, Golinkoff, & Werker, 2012). Hearing infants are 
able to discriminate American Sign Language (ASL) signs at 
4 months but not at 14 months, whereas infants learning ASL 
are still able to discriminate signs at the later age. This result
suggests that narrowing happens for language regardless of
whether the support is gestural or oral.

NARROWING AS A CATEGORIZATION PROCESS 
SERVING SOCIAL NEEDS 

Our view is that narrowing occurs for different cognitive abilities 
commonly involved in communication, even though not all evi-
dence uniformly shows that narrowing occurs simultaneously
across different domains (see, e.g., Hayden, Bhatt, Kangas, Zie- 
ber, & Joseph, 2012, for evidence of own-race specialization 
several months before language narrowing). Therefore, the 
underlying mechanism might not be specific to one cognitive 
ability, but common to all communicative tools. In terms of evo-
lution, it emerged first for processing faces and facial expres-
sions, and therefore, should have been part of primitive
language involving rhythm and gestures before becoming part of 
oral language. 

Concomitant occurrence in multiple modalities does not 
explain why narrowing happens. Our take is that infants are born
into a social group that has developed a culture of communication 
that is unique, opaque (i.e., association between an oral/gestural 
sign and a referent may be arbitrary), and subject to evolution. 
The most effective way to integrate within the group may be to 
adapt rapidly to the group's social habits and communication tra- 
ditions. During the first 12 months, when infants mainly interact 
with the mother/caregiver, they have to learn rapidly the appropri- 
ate way of communicating when interacting within the social 
group. The mother/caregiver transmits the basic aspects of com- 
munication that are crucial to being part of the community: 
smiles, language characteristics, and recognition of specific faces. 



The child then calibrates its communication systems using 
learning abilities including imitation. If the child is exposed to
several individuals, he or she uses convergence mechanisms to 
calibrate the system and ends up with finely tuned representa- 
tions of the faces in the environment as well as detailed repre- 
sentations of the phonemes and prosodic patterns in the ambient 
language(s). 

By this account, narrowing is a categorization process that 
serves social needs. In the language domain, infants build a 
broad category including the nonnative contrasts that are lost, 
and retain tightly tuned categories for native contrasts. In the 
same way, in the face domain, infants build a large category for 
other-race faces including multiple other-race face categories 
(e.g., for infants exposed mainly to Caucasian faces, this cate- 
gory would include Asian and African faces), and build tightly 
tuned categories organized around subordinate-level identity 
information for same-race faces (i.e., Olivier vs. Helene vs. 
Paul). Therefore, narrowing can be conceived of as a system that 
allows the infant to become more efficient or specialized for the 
social stimuli at hand in the close environment. 

CONCLUSION 

In this article, we have argued that perceptual narrowing should 
be observed for all forms of social communication. During evolu- 
tion, our social communication used different perceptual and 
cognitive systems — face, facial expression, gesture, vocalization,
sound, and oral language — that emerged at different times. 
These systems are interactive in adults and their neural mecha- 
nisms are linked to some extent. Their development presents 
similarities as infants adjust to their native social group. 

We suggest that the adaptation is accomplished through a 
specific mechanism dedicated to social cognition, which encom- 
passes the different modalities of communication, including 
manual and visuo-facial gesture processing, as well as vocaliza- 
tion processing abilities. However, we are uncommitted to 
whether such a mechanism is part of the core endowment pres- 
ent at birth or is a product of increasing specialization that 
occurs with development. Behavioral and neuroimaging studies 
should look at the intertwining of the development of these 
social abilities. Our suggestion also pertains to the field of neu- 
rological or developmental disorders: We predict that deficits in 
either the development of manual gesture processing, facial ges- 
ture processing, or vocalization processing should result in dis- 
orders of social communication. This prediction is supported by 
work on autism spectrum disorders suggesting that social com- 
munication strongly relies on the healthy development of these 
different abilities (Adolphs, Sears, & Piven, 2001; Baron-Cohen, 
1989). Although further work is needed to understand this mul- 
timodal adaptation process, our account is that the interplay of 
systems that process faces and language in the development of 
social communication underlies the occurrences of perceptual 
narrowing in different domains. 